HYPERFLEET-971 - feat: reject nodepool create/patch on soft-deleted cluster #113
kuudori wants to merge 2 commits into openshift-hyperfleet:main
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. The full list of commands accepted by this bot can be found here. Needs approval from an approver in each of these files.
Walkthrough
This PR adds a new RFC 9457 conflict error code.

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client as Client
    participant Handler as ClusterHandler / NodePoolsHandler
    participant Service as ClusterService
    participant Errors as errors.Registry
    Client->>Handler: HTTP PATCH/POST (cluster/nodepool)
    Handler->>Service: Get(clusterID)
    Service-->>Handler: cluster (maybe nil)
    alt cluster missing
        Handler-->>Client: 404 Not Found
    else cluster present
        alt cluster.DeletedTime != nil
            Handler->>Errors: ConflictState("marked for deletion", ...)
            Errors-->>Handler: ProblemDetails (409, HYPERFLEET-CNF-003)
            Handler-->>Client: 409 Conflict (problem-details)
        else cluster active
            opt nodepool flow: check nodepool.DeletedTime
                alt nodepool.DeletedTime != nil
                    Handler->>Errors: ConflictState("marked for deletion", ...)
                    Errors-->>Handler: ProblemDetails (409,...)
                    Handler-->>Client: 409 Conflict
                else nodepool active
                    Handler->>Service: Replace/Create(...)
                    Service-->>Handler: updated resource
                    Handler-->>Client: 200/201 with body
                end
            end
        end
    end
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Accidentally clicked "generate unit tests" under the CodeRabbit comment; deleted its comments after the unsuccessful attempt.
Nice work! The soft-delete guards on the cluster-scoped endpoints look solid. One thing I noticed: the standalone endpoint appears to be missing this check. Worth adding the same guard there for consistency, or at least tracking it as a follow-up.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/handlers/cluster_nodepools.go`:
- Around line 193-200: The soft-delete check on cluster.DeletedTime in the
handler (after clusterService.Get) is racy; move this validation into the
service/DAO layer and enforce it atomically with the write. Update the nodepool
creation/replacement paths (the service methods that call Create()/Replace() on
the DAO) to either re-check DeletedTime inside the same transaction or add a
precondition to the DB write (e.g., include WHERE cluster.deleted_time IS NULL
in the INSERT/UPDATE SQL) so Create()/Replace() fail if the cluster is marked
deleted; ensure the service/DAO returns a clear ConflictState error when that
condition is hit.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Enterprise
Run ID: 9ca03664-3434-4b51-9885-8613945b3352
📒 Files selected for processing (5)
- pkg/errors/errors.go
- pkg/handlers/cluster.go
- pkg/handlers/cluster_nodepools.go
- pkg/handlers/cluster_nodepools_test.go
- pkg/handlers/cluster_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
- pkg/errors/errors.go
- pkg/handlers/cluster.go
```go
cluster, err := h.clusterService.Get(ctx, clusterID)
if err != nil {
	return nil, err
}

if cluster.DeletedTime != nil {
	return nil, errors.ConflictState("Cluster '%s' is marked for deletion", clusterID)
}
```
🧩 Analysis chain
🏁 Script executed:

```shell
#!/bin/bash
# Description: Inspect NodePool/Cluster service implementations for atomic soft-delete enforcement on write paths.
rg -nP --type go 'func \(.*\) (Create|Replace)\(' -C 4
rg -nP --type go 'DeletedTime|ConflictState|transaction|tx\b|FOR UPDATE|optimistic' -C 4
```

Repository: openshift-hyperfleet/hyperfleet-api
Length of output: 50392
🏁 Script executed:

```shell
# Find Create/Replace implementations in service and DAO layers with DeletedTime checks
rg -n 'func.*Create\(|func.*Replace\(' pkg/services pkg/dao -A 20 | head -100
```

Repository: openshift-hyperfleet/hyperfleet-api
Length of output: 6057
🏁 Script executed:

```shell
# Search for DeletedTime checks in Create/Replace context
rg -B 5 -A 15 'DeletedTime' pkg/services pkg/dao | grep -E '(func|Create|Replace|DeletedTime)' | head -50
```

Repository: openshift-hyperfleet/hyperfleet-api
Length of output: 3722
🏁 Script executed:

```shell
# Look at the actual handler code around line 193-200 and 269-271
sed -n '190,210p' pkg/handlers/cluster_nodepools.go
sed -n '265,275p' pkg/handlers/cluster_nodepools.go
```

Repository: openshift-hyperfleet/hyperfleet-api
Length of output: 996
🏁 Script executed:

```shell
# Check if the write operations (Create/Replace) in services revalidate state
rg -A 30 'func \(s \*sqlClusterService\) Create\(' pkg/services/cluster.go
rg -A 30 'func \(s \*sqlNodePoolService\) Create\(' pkg/services/node_pool.go
```

Repository: openshift-hyperfleet/hyperfleet-api
Length of output: 1760
The soft-delete check must be re-validated atomically in the write path.
The handler's Get and subsequent Create/Replace are separate operations, allowing a cluster to be marked for deletion between the check (line 195–197) and the write. The service and DAO implementations skip re-validation—Create() and Replace() proceed without re-checking DeletedTime.
Move the validation into the service or DAO layer and enforce it transactionally with the write, or add an explicit precondition check (e.g., WHERE cluster.deleted_time IS NULL) in the database write to prevent orphaned nodepools.
Prevent nodepool creation and updates on clusters marked for deletion by returning 409 Conflict, avoiding orphaned resources after cluster cleanup.
Summary
Test Plan
- `make test-all` passes
- `make lint` passes
- `make test-helm` (if applicable)

Summary by CodeRabbit
- Bug Fixes
- New
- Tests